Abstract. Consistent Query Answering (CQA) is the problem of computing
from a database the answers to a query that are consistent with
respect to certain integrity constraints that the database, as a whole, may
fail to satisfy. Consistent answers have been characterized as those that
are invariant under certain minimal forms of restoration of the database
consistency. In this paper we investigate algorithmic and complexity
theoretic issues of CQA under database repairs that minimally depart -wrt
the cardinality of the symmetric difference- from the original database.
Research on this kind of repairs has been suggested in the literature, but
no systematic study had been done. Here we obtain first tight complexity
bounds. We also address, considering for the first time a dynamic scenario
for CQA, the problem of incremental complexity of CQA, that naturally
occurs when an originally consistent database becomes inconsistent after
the execution of a sequence of update operations. Tight bounds on
incremental complexity are provided for various semantics under denial
constraints, e.g. (a) minimum tuple-based repairs wrt cardinality, (b)
minimal tuple-based repairs wrt set inclusion, and (c) minimum numerical
aggregation of attribute-based repairs. Fixed parameter tractability
is also investigated in this dynamic context, where the size of the update
sequence becomes the relevant parameter.

1 Introduction 

Integrity constraints (ICs) capture the semantics of data and are expected to
be satisfied by a database in order to keep its correspondence with the outside
reality it is modelling. However, it is often the case that IC satisfaction cannot
be guaranteed, and inconsistent database states are common, e.g. in integrated
databases, census databases, legacy data, etc. [7].

Consistent Query Answering (CQA) is the problem of computing from a 
database those answers to a query that are consistent with respect to certain 
ICs, that the database as a whole may fail to satisfy. Consistent answers have 
been characterized as those that are invariant under minimal forms of restoration 
of the database consistency [2]. From this perspective, CQA is a form of cautious 
reasoning from a database under integrity constraints. 

The notion of minimal restoration of consistency was captured in [2] in terms
of database repairs, i.e. new, consistent database instances that share the schema
with the original database, but differ from the latter by a minimal set of whole
tuples under set inclusion. In [7, 19, 2, 9, 4, 11] complexity bounds for CQA under
this repair semantics have been reported. However, the semantics of CQA based
on "cardinality-based repairs" of the original database, which minimize the
number of whole database tuples by which the instances differ, has received
less attention.



Example 1. Consider a database schema $P(X, Y, Z)$ with the functional
dependency $X \to Y$. The inconsistent instance
$D = \{P(a,b,c), P(a,c,d), P(a,c,e)\}$, seen as a set of ground atoms, has two
repairs wrt set inclusion, namely $D_1 = \{P(a,b,c)\}$ and
$D_2 = \{P(a,c,d), P(a,c,e)\}$, because the symmetric set differences with the
original instance, i.e. $\Delta(D, D_1)$, $\Delta(D, D_2)$, are minimal under set
inclusion. However only $D_2$ is a cardinality-based repair, because the
cardinality $|\Delta(D, D_2)|$ of the symmetric set difference becomes a
minimum. □
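
The two notions of repair in Example 1 can be checked mechanically. The following is a minimal brute-force sketch, not part of the original development: it assumes that candidate repairs are sub-instances of $D$ (sufficient for a functional dependency, where only deletions help), and the function names `satisfies_fd`, `s_repairs` and `c_repairs` are hypothetical.

```python
from itertools import combinations

# Instance D of Example 1, as a set of ground atoms P(x, y, z).
D = {("a", "b", "c"), ("a", "c", "d"), ("a", "c", "e")}

def satisfies_fd(instance):
    """True iff the FD X -> Y holds: equal X-values imply equal Y-values."""
    return all(t[1] == u[1] for t in instance for u in instance if t[0] == u[0])

def consistent_subsets(db):
    """All consistent sub-instances of db (assumption: deletions only)."""
    tuples = sorted(db)
    for r in range(len(tuples) + 1):
        for kept in combinations(tuples, r):
            if satisfies_fd(kept):
                yield frozenset(kept)

def s_repairs(db):
    """Candidates whose symmetric difference with db is minimal under inclusion."""
    cands = list(consistent_subsets(db))
    # For a subset c of db, Delta(db, c) = db - c, so a smaller Delta means a larger c.
    return {c for c in cands if not any(c < other for other in cands)}

def c_repairs(db):
    """Candidates whose symmetric difference with db has minimum cardinality."""
    cands = list(consistent_subsets(db))
    best = min(len(db ^ c) for c in cands)
    return {c for c in cands if len(db ^ c) == best}
```

On the instance of Example 1, `s_repairs(D)` contains both $D_1$ and $D_2$, while `c_repairs(D)` contains only $D_2$. The enumeration is exponential in $|D|$ and is meant for illustration only.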

In this paper we address the problem of obtaining complexity bounds for CQA 
under the semantics given by cardinality-based repairs, and we do this by in- 
troducing some graph theoretic techniques and results that, apart from being 
interesting by themselves, have a wider applicability in the context of CQA. Al- 
though research on cardinality- and tuple-based repairs has been proposed and 
started before in the context of CQA [7], no detailed analysis of their complexity
theoretic properties has been provided. In [3] a brief illustration was given of how 
to specify cardinality based repairs using logic programs with weak cardinality 
constraints [8] and stable model semantics. 

Our emphasis is on CQA, as opposed to computing or checking repairs. This 
is because we are usually not interested in computing specific repairs (there are 
exceptions though, e.g. in census-like data [6]), but in characterizing and
computing consistent answers to queries. However, the repair semantics we choose
will have an impact on CQA. 

Example 2. (example 1 continued) The query $P(x, y, z)$? has $(a, c, d)$ and
$(a, c, e)$ as consistent answers under the cardinality semantics (the classic
answers in the only repair), but none under the set inclusion semantics (there
is no classic answer shared by the two repairs). □
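
Example 2 amounts to intersecting the classic answers over all repairs. A small sketch of this, with the repairs of Example 1 written out by hand and with hypothetical function names, could look as follows.

```python
def answers(instance):
    """Classic answers to the query P(x, y, z): every tuple of the instance."""
    return set(instance)

def consistent_answers(repairs):
    """Answers that hold in every repair (intersection over all repairs)."""
    repairs = list(repairs)
    result = answers(repairs[0])
    for r in repairs[1:]:
        result &= answers(r)
    return result

D1 = {("a", "b", "c")}
D2 = {("a", "c", "d"), ("a", "c", "e")}

under_c = consistent_answers([D2])      # the only C-repair of Example 1
under_s = consistent_answers([D1, D2])  # the two S-repairs of Example 1
```

Here `under_c` holds the two tuples of $D_2$, and `under_s` is empty, in agreement with the example.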

All the complexity bounds on CQA given so far in the literature, no matter 
what repair semantics is chosen, consider the static case: Given a snapshot of 
a database, a set of integrity constraints, and a query, the problem is to find 
consistent query answers. However, databases are essentially dynamic structures, 
subject to update operations. In this paper we also take into account dynamic 
aspects of data, studying the complexity of CQA when the consistency of a 
consistent database may be affected by update actions. 

Example 3. (examples 1 and 2 continued) The cardinality-based repair
$D_2 = \{P(a, c, d), P(a, c, e)\}$ is obviously consistent, however after the
execution of the update operation $insert(P(a, f, d))$ it becomes inconsistent.
In this case, the only cardinality repair of $D_2 \cup \{P(a, f, d)\}$ is $D_2$
itself. So, CQA from $D_2 \cup \{P(a, f, d)\}$ amounts to classic query
answering from $D_2$. However, if we start from the consistent instance
$D' = \{P(a, c, d)\}$, executing the same update operation leads to two
cardinality repairs, namely $D'$, but also $\{P(a, f, d)\}$, and now CQA from
$D' \cup \{P(a, f, d)\}$ is different from classic query answering from $D'$,
because two repairs have to be taken into account. □



In this case, it would be inefficient to compute a materialized repair of the 
database or a consistent answer to the query from scratch after every update. 
In this paper we investigate how a pre-computed repair of the database at a 
previous step, or the original instance itself if it was already consistent, can be
used to consistently answer queries after update operations. We provide a unified 
approach to the study of the computational complexity of incremental consistent 
query answering; and not only under cardinality-based repairs, but also under 
other repair semantics, like the classic minimal set inclusion semantics and the 
one based on minimization of changes of attribute values (attribute-based re- 
pairs) that have been already used in the literature [35, 18, 6, 17]. 

Incremental algorithms have been developed to check integrity constraint 
satisfaction [28]. In a similar spirit we find work on incremental database
maintenance, i.e. integrity constraint satisfaction, by means of compensating active
rules [34]. However, to the best of our knowledge, incremental CQA and incre- 
mental repair computation (under updates) have not been treated before. We 
know, by data complexity theoretic reasons, that some first-order queries asking 
for consistent answers cannot be expressed as first-order queries asking for classic 
answers [7, 9, 11, 19]. However, in the incremental, dynamic context consistent 
answers could be expressed as classic answers to first-order queries. A similar 
situation can be found in incremental evaluation of queries that, statically, are 
not expressible in the query language at hand, but incremental computation can 
be performed and expressed [26]. 

Cardinality-based CQA as studied in this paper has interesting properties 
that make it useful as a semantics for CQA. First of all, as illustrated in Ex- 
ample 1, it is clearly the case that every cardinality-based repair is also a set 
inclusion-based repair, but not necessarily the other way around. In consequence, 
the consistent query answers under cardinality repairs form a superset of the con- 
sistent answers under the set inclusion-based semantics. Actually, in situations 
where the latter does not give any answers (c.f. Example 2), the former does 
return answers, which is good. They could be further filtered out according to 
other criteria at a post-processing phase. In extreme cases, when there is only 
one database tuple in semantic conflict with the rest of a possibly large set of
other tuples, the existence of a set inclusion-based repair containing the only 
conflicting tuple would easily lead to an empty set of consistent answers. The 
cardinality-based semantics would not allow such a repair. (Example 4 below 
illustrates this situation.) 

This feature of the cardinality-based repair semantics comes at a price. In 
Section 3 we prove that CQA under it has a higher data complexity than under
the classic, set inclusion-based semantics, actually $P^{NP(\log(n))}$-hard vs.
PTIME for denial constraints [11]. On the other side, the cardinality-based
semantics has the interesting property that CQA, a form of cautious (or
certain) reasoning (true in
all repairs) and its brave (or possible) version, i.e. true in some repair, are mu- 
tually poly-time reducible and share the same complexity. This is established in 
Section 3 by proving first some useful graph theoretic lemmas about maximum 
independent sets. This result may not hold for classic CQA. 

Furthermore, we prove in Sections 4.1 and 5 that incremental CQA for
conjunctive queries under the cardinality-based semantics has a lower complexity
than incremental CQA for the classic semantics, actually PTIME vs. coNP-hard
(in data complexity), which makes the cardinality semantics more appealing in
a dynamic setting.

As we just mentioned, incremental CQA under the cardinality semantics is 
polynomial in data complexity, but naive algorithms are exponential if the size of
the update sequence is a part of the input and the combined complexity is con- 
sidered. In Section 4.1 we study the parameterized complexity [15, 22] of CQA, 
actually for the incremental case and under cardinality-based semantics, where 
the parameter is the size of the update sequence. We establish that the problem 
is fixed parameter tractable by providing a concrete parameterized algorithm. 

For comparison with the cardinality-based semantics, we obtain in Section 5 
new results on the static and incremental complexity under the classic semantics 
(i.e. tuple- and set inclusion-based distance) and the attribute-based semantics 
(i.e. attribute value-based and minimization of attribute changes). We prove 
that static CQA for the weighted version of the attribute-based semantics and 
incremental CQA under the attribute-based semantics both become $P^{NP}$-hard
in data. 

We concentrate on relational databases and basically on the class of denial 
integrity constraints, which includes most of the constraints found in applications 
where inconsistencies naturally arise, e.g. census-like databases [6], experimental 
samples databases, biological databases, etc. Complexity results refer to data 
complexity [1]. For complexity theoretic definitions and classic results we refer to
[31], to [1] for foundations of databases, and to [15] for parameterized complexity. 



2 Preliminaries 

A relational database $D$ can be identified with a finite set of ground atoms of
the form $R(\bar{t})$, where $R$ is a relation in the database schema
$\mathcal{D}$, and $\bar{t}$ is a finite sequence of constants taken from the
underlying database domain $\mathcal{U}$. The ground atom $R(\bar{t})$ is also
called a database tuple.^1 The relational schema $\mathcal{D}$ determines a
first-order language $L(\mathcal{D})$ based on the relation names, the elements
of $\mathcal{U}$, and extra built-in predicates. In the language
$L(\mathcal{D})$, integrity constraints are sentences, and queries are formulas,
usually with free variables. We assume in this paper that sets $IC$ of ICs are
always consistent in the sense that they are simultaneously satisfiable as
first-order sentences. A database is consistent wrt a given set of integrity
constraints $IC$ if the sentences in $IC$ are all true in $D$, denoted
$D \models IC$. An answer to a query $Q(\bar{x})$, with free variables
$\bar{x}$, is a tuple $\bar{t}$ that makes $Q$ true in $D$ when the variables
in $\bar{x}$ are interpreted as the corresponding values in $\bar{t}$, denoted
$D \models Q[\bar{t}]$.

For a database D, possibly inconsistent with respect to IC , the consistent 
answers to a query Q from D wrt IC are characterized as those answers that 
are invariant under all minimal forms of restoration of consistency for $D$, where
minimality refers to some sort of distance between the original instance D and 
alternative consistent instances. 



^1 We also use the term tuple to refer to a finite sequence
$\bar{t} = (c_1, \ldots, c_n)$ of constants of the database domain
$\mathcal{U}$, but a database tuple is a ground atomic sentence with
predicate in $\mathcal{D}$ (excluding built-in predicates, like comparisons).



Definition 1. For a database $D$, integrity constraints $IC$ and a partial order
$\preceq_{D,S}$ over databases depending on the original database $D$ and a
repair semantics $S$, a repair of $D$ wrt $IC$ under $S$ is an instance $D'$
such that: (a) $D'$ has the same schema and domain as $D$; (b) $D' \models IC$;
and (c) there is no $D''$ satisfying (a) and (b), such that
$D'' \prec_{D,S} D'$, i.e. $D'' \preceq_{D,S} D'$ and not
$D' \preceq_{D,S} D''$. The set of all repairs is denoted with
$Rep(D, IC, S)$. □

The class $Rep(D, IC, S)$ depends upon the semantics $S$, which determines the
partial order $\preceq_{D,S}$ and the way repairs can be obtained, e.g. by
allowing both insertions and deletions of whole database tuples [2], or
deletions of them only [11], or only changes of attribute values [35, 6, 17],
etc. (c.f. Definition 3.)

Definition 2. Let $D$ be a database, $IC$ a set of ICs, and $Q(\bar{x})$ a
query. (a) A ground tuple $\bar{t}$ is a consistent answer to $Q$ wrt $IC$
under semantics $S$ if for every $D' \in Rep(D, IC, S)$,
$D' \models Q[\bar{t}]$. (b) $Cqa(Q, D, IC, S)$ is the set of consistent
answers to $Q$ in $D$ wrt $IC$ under semantics $S$. If $Q$ is a sentence (a
boolean query), $Cqa(Q, D, IC, S) := \{yes\}$ when $D' \models Q$ for every
$D' \in Rep(D, IC, S)$, and $Cqa(Q, D, IC, S) := \{no\}$, otherwise.
(c) $CQA(Q, IC, S) := \{(D, \bar{t}) \mid \bar{t} \in Cqa(Q, D, IC, S)\}$, the
decision problem of consistent query answering. □

The decision problem of CQA just defined is for the static case, in the sense that
it only considers a snapshot of the database. In the literature different notions
of distance have been considered, and they give rise to different repair
semantics. We summarize here the most common ones, those that will be
investigated in this work. In the following, $\Delta(D', D)$ denotes the
symmetric difference $(D' \setminus D) \cup (D \setminus D')$ of two database
instances, both conceived as sets of ground atoms.

Definition 3. (minimality semantics) (a) Minimal set inclusion semantics
(simply, S-repair semantics) [2]:
$D' \preceq_{D,S} D'' :\Longleftrightarrow \Delta(D', D) \subseteq \Delta(D'', D)$.
(b) Minimum cardinality set semantics (C-repair semantics):
$D' \preceq_{D,C} D'' :\Longleftrightarrow |\Delta(D', D)| \le |\Delta(D'', D)|$.
(c) Aggregate attribute difference semantics (A-repair semantics)
minimizes a numerical aggregation function over attribute changes throughout
the database. □

Particular classes of A-repairs can be found in [18, 17], where the aggregation 
function to be minimized is the number of all attribute changes; and in [6], 
where the function is the overall quadratic difference obtained from the changes 
in numerical attributes between the original database and the repair. 

S-repairs and C-repairs are examples of tuple-based repairs, in the sense that 
consistency is restored by inserting and/or deleting database tuples. A-repairs 
are attribute-based repairs, under which database instances can be repaired by 
changing attribute values in existing tuples only. Classes of attribute-based
repairs have been studied in [35, 18, 6, 17]. Another notion of attribute-based 
repair, not explored so far, and not included in Definition 3, could minimize, set- 
theoretically, the set of attribute changes, with priorities imposed on attributes. 
We will consider other repair semantics later on (e.g. Definition 5) and particular 
cases of attribute-based repairs (c.f. Section 5.2). 

It is easy to prove that every C-repair is an S-repair; and consequently every
consistent query answer under the S-semantics is a consistent query answer under
the C-semantics. However, as Example 1 shows, not every S-repair is a
C-repair. In that example, attribute-based repairs could be
$\{P(a,c,c), P(a,c,d), P(a,c,e)\}$, suggesting that we made a mistake in the
second argument of the first tuple, but also
$\{P(a,b,c), P(a,b,d), P(a,b,e)\}$. If the aggregate function in
Definition 3(c) is the number of changes in attribute values, the former would
be a repair, but not the latter. These instances are neither S- nor C-repairs if
the changes of attribute values have to be simulated via deletions followed by
insertions.
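
The comparison between the two attribute-based candidates above reduces to counting changed attribute values. A minimal sketch follows; it assumes that the tuples of the original and of the candidate instance are listed in corresponding order, and `attribute_changes` is a hypothetical helper name.

```python
def attribute_changes(original, repaired):
    """Number of attribute positions that differ, tuple by tuple (same order)."""
    return sum(a != b for t, u in zip(original, repaired) for a, b in zip(t, u))

D = [("a", "b", "c"), ("a", "c", "d"), ("a", "c", "e")]
first = [("a", "c", "c"), ("a", "c", "d"), ("a", "c", "e")]
second = [("a", "b", "c"), ("a", "b", "d"), ("a", "b", "e")]

cost_first = attribute_changes(D, first)    # one value changed
cost_second = attribute_changes(D, second)  # two values changed
```

With the number of changes as aggregation function, the first candidate is cheaper than the second, which is why only the former qualifies as a repair.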

Integrity constraints may be any first-order sentences written in language
$L(\mathcal{D})$, but most of our results refer to denial constraints only.

Definition 4. Denial constraints are integrity constraints of the form
$\forall \bar{x} \neg (A_1 \wedge \ldots \wedge A_m \wedge \gamma)$, where each
$A_i$ is a database atom and $\gamma$ is a conjunction of comparison atoms. □

Notice that functional dependencies (FDs), e.g.
$\forall x \forall y \forall z\, \neg (R(x, y) \wedge R(x, z) \wedge y \neq z)$,
are binary denial constraints; and range constraints are one-database atom
denials. For denial ICs, tuple-based repairs are obtained by tuple deletions only.

In this paper we concentrate on data complexity. We briefly recall some of
the complexity classes used in this paper. $FP$ is a class of functional problems
associated with languages in the class $P$ of decision problems that are solvable
in polynomial time. $P^{NP}$ (or $\Delta_2^P$) is the class of decision problems
solvable in polynomial time by a machine that makes calls to an $NP$ oracle.
$P^{NP(\log(n))}$ is similarly defined, but the number of calls is logarithmic.
It is not known if $P^{NP(\log(n))}$ is strictly contained in $P^{NP}$. The
functional class $FP^{NP(\log(n))}$ is similarly defined. The class
$\Delta_3^P(\log(n))$ contains decision problems that can be solved by a
polynomial time machine that makes a logarithmic number of calls to an oracle
in $\Sigma_2^P$.



3 Complexity of CQA under the C-Repair Semantics 

CQA under the minimal cardinality repair semantics (C-repair semantics in Def- 
inition 3(b)) has received less attention in the literature than the same problem 
under the S-repair semantics. An exception is [3], where C-repairs were specified 
using logic programs with non-prioritized weak constraints under the skeptical 
stable model semantics [8]. As a consequence, from results in [8] (c.f. also [25]), 
we obtain that an upper bound on the data complexity of CQA under the
C-repair semantics is the class $\Delta_3^P(\log(n))$. In this section we
investigate the static
complexity of tuple-based CQA under the C-repair semantics. 

In [4], conflict graphs were first introduced to study the complexity of CQA
for aggregate queries wrt FDs under the S-repair semantics. They have as vertices
the database tuples, and edges connect two tuples that simultaneously violate an
FD. There is a one-to-one correspondence between S-repairs of the database and
the set-theoretically maximal independent sets (simply called maximal indepen- 
dent sets) in the conflict graph. Similarly, there is a one-to-one correspondence 
between C-repairs and maximum independent sets in the same graph (but now 
they are maximum in cardinality). 
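
The correspondence can be observed on the instance of Example 1. The sketch below builds the conflict graph for the FD $X \to Y$ and enumerates independent sets by brute force; the function names are hypothetical and the enumeration is exponential, so it is only suitable for small instances.

```python
from itertools import combinations

def conflict_graph(db):
    """Edges join two tuples that jointly violate the FD X -> Y of Example 1."""
    return {frozenset((t, u)) for t, u in combinations(sorted(db), 2)
            if t[0] == u[0] and t[1] != u[1]}

def independent_sets(vertices, edges):
    """All vertex sets that contain no edge."""
    verts = sorted(vertices)
    for r in range(len(verts) + 1):
        for subset in combinations(verts, r):
            s = frozenset(subset)
            if not any(e <= s for e in edges):
                yield s

def maximal_independent_sets(vertices, edges):
    """Independent sets that are maximal under set inclusion (S-repairs)."""
    sets = list(independent_sets(vertices, edges))
    return {s for s in sets if not any(s < other for other in sets)}

def maximum_independent_sets(vertices, edges):
    """Independent sets of maximum cardinality (C-repairs)."""
    sets = list(independent_sets(vertices, edges))
    top = max(len(s) for s in sets)
    return {s for s in sets if len(s) == top}

D = {("a", "b", "c"), ("a", "c", "d"), ("a", "c", "e")}
E = conflict_graph(D)
```

For this instance the maximal independent sets are $\{P(a,b,c)\}$ and $\{P(a,c,d), P(a,c,e)\}$, and only the latter is maximum.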



Notice that, unless an IC forces a particular tuple not to belong to the
database^2, every tuple in the original database belongs to some S-repair, but
not necessarily to a C-repair (c.f. Example 1, where the tuple $P(a, b, c)$ does
not belong to the only C-repair). In consequence, testing membership of vertices
to some maximum independent set becomes a problem that is relevant to address.
For this purpose we will make good use of some graph theoretic constructions
and results about maximum independent sets obtained from them, whose proofs
use a self-reducibility property of independent sets that can be expressed as
follows: For any graph $G$ and vertex $v$, every maximum independent set that
contains $v$ (meaning maximum among the independent sets that contain $v$)
consists of vertex $v$ together with a maximum independent set of the graph
$G'$ that is obtained from $G$ by deleting all vertices adjacent to $v$.
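
The self-reducibility property suggests a simple branching computation of the size of a maximum independent set. The sketch below is one possible rendering and not the construction used in the proofs: it removes $v$ together with its neighbours and counts $v$ separately, which is equivalent to the formulation above. The names `neighbors` and `alpha` are hypothetical, and edges are assumed to be given as pairs.

```python
def neighbors(edges, v):
    """Vertices adjacent to v in a graph given as a list of pairs."""
    return {u for e in edges if v in e for u in e if u != v}

def alpha(vertices, edges):
    """Size of a maximum independent set, by branching on one vertex."""
    vertices = set(vertices)
    if not vertices:
        return 0
    v = min(vertices)
    # Case 1: v is left out of the independent set.
    without_v = alpha(vertices - {v}, [e for e in edges if v not in e])
    # Case 2: v is taken, so v and all its neighbours are removed.
    removed = {v} | neighbors(edges, v)
    rest = vertices - removed
    with_v = 1 + alpha(rest, [e for e in edges if not (set(e) & removed)])
    return max(without_v, with_v)

path = [(1, 2), (2, 3)]
triangle = [(1, 2), (2, 3), (1, 3)]
cycle5 = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
```

The running time is exponential in the number of vertices, as expected for an NP-hard quantity.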

Lemma 1. Given a graph $G$ and a vertex $v$ in it, a graph $G'$ that extends $G$
can be constructed in polynomial time in the size of $G$, such that there is a
maximum independent set $I$ of $G$ containing $v$ iff $v$ belongs to every
maximum independent set of $G'$ iff the sizes of maximum independent sets in
$G$ and $G'$ differ by one. □

Actually the graph $G'$ in this lemma can be obtained by adding a new vertex $v'$
that is connected only to the neighbors of $v$. Conversely, the following holds:

Lemma 2. For every graph G and vertex v there is a graph G' that can be 
constructed in polynomial time in the size of G, such that v belongs to all 
maximum independent sets of G iff v belongs to some maximum independent 
set of G'. □
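
The construction mentioned for Lemma 1 (a new vertex $v'$ adjacent exactly to the neighbours of $v$) is easy to try out on small graphs. The following brute-force sketch evaluates the three conditions of the lemma; the helper names and the example path graph are assumptions made for illustration.

```python
from itertools import combinations

def maximum_independent_sets(vertices, edges):
    """All maximum independent sets, by brute force from the largest size down."""
    verts = sorted(vertices)
    for r in range(len(verts), -1, -1):
        found = [set(c) for c in combinations(verts, r)
                 if not any(u in c and w in c for u, w in edges)]
        if found:  # the empty set is always independent, so this holds by r = 0
            return found

def add_twin(vertices, edges, v, twin):
    """Extend the graph with a new vertex `twin` adjacent to the neighbours of v."""
    nbrs = sorted({w if u == v else u for u, w in edges if v in (u, w)})
    return set(vertices) | {twin}, list(edges) + [(twin, n) for n in nbrs]

def lemma1_conditions(vertices, edges, v, twin):
    """The three conditions of Lemma 1, evaluated independently of each other."""
    mis_g = maximum_independent_sets(vertices, edges)
    mis_ext = maximum_independent_sets(*add_twin(vertices, edges, v, twin))
    in_some = any(v in s for s in mis_g)
    in_every_ext = all(v in s for s in mis_ext)
    differ_by_one = len(mis_ext[0]) - len(mis_g[0]) == 1
    return in_some, in_every_ext, differ_by_one

V = {1, 2, 3}
E = [(1, 2), (2, 3)]  # the path 1 - 2 - 3
```

On the path, vertex 1 lies in the maximum independent set $\{1, 3\}$ and all three conditions hold, while for the middle vertex 2 all three fail, as the lemma predicts.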

From the lemmas and the membership to $FP^{NP(\log(n))}$ of computing the size of
a maximum clique in a graph [24], we obtain

Proposition 1. The problems of deciding for a vertex in a graph if it belongs to
some maximum independent set and if it belongs to all maximum independent
sets are both in $P^{NP(\log(n))}$. □

Since a ground atomic query is consistently true when it belongs, as a database 
tuple, i.e. as a vertex in the conflict graph, to all the maximum independent sets 
of the conflict graph, we obtain 

Corollary 1. For functional dependencies and ground atomic queries, CQA
under the C-repair semantics belongs to $P^{NP(\log(n))}$. □

Considering the maximum independent sets (or the C-repairs) as a collection 
of possible worlds, the previous lemmas show a close connection between the 
certain and possible C-semantics, that sanctions something as true if it is true 
in every (the default for CQA), resp. some possible world. CQA under these 
semantics and functional dependencies are polynomially reducible to each other, 
and also share the same complexity. 

Using this result, it is possible to extend Corollary 1 to negative atomic
queries because $P^{NP(\log(n))}$ is closed under complement: Notice that the
fact that a vertex does not belong to any maximum independent set means that
the certain answer to the corresponding negated query is yes and the possible
answer to the corresponding atomic query is no (or better, false in the sense
that it is false in every C-repair). On the other side, that there is a maximum
independent set to which a vertex does not belong means that the possible
answer to the corresponding negated query is yes and the certain answer to the
positive query is no. Corollary 1 also holds for queries that are conjunctions
of atoms.

^2 We do not consider in this work such non generic ICs [7].

The next result shows that graphs with their maximum independent sets can 
be uniformly encoded as database repair problems under the C-semantics. 

Proposition 2. There is a fixed database schema $\mathcal{D}$ and a denial
constraint $\varphi$ in $L(\mathcal{D})$, such that for every graph $G$, there
is an instance $D$ over $\mathcal{D}$, whose C-repairs wrt $\varphi$ are in
one-to-one correspondence with the maximum independent sets of $G$.
Furthermore, $D$ can be built in polynomial time in the size of $G$. □

This proposition is a representation result, of the maximum independent sets 
of a graph as the C-repairs of an inconsistent database wrt a denial constraint. 
This is interesting, because conflict graphs for databases wrt denial constraints 
are actually conflict hypergraphs [11] that have as vertices the database tuples, 
and as hyperedges the (set theoretically minimal) collections of tuples that si- 
multaneously violate one of the denial constraints. 

The correspondence for conflict graphs between repairs and independent sets 
-maximum or maximal depending on the semantics- still holds for hypergraphs, 
where an independent set in a hypergraph is a set of vertices that does not
contain any hyperedges [11]. Lemmas 1 and 2 and Proposition 1 still hold for 
hypergraphs, and in consequence the polynomial time mutual reducibility be- 
tween the certain and possible semantics for CQA still holds for denial constraints 
and ground atomic queries. 

From Proposition 2 and the $FP^{NP(\log(n))}$-completeness of determining the size
of a maximum clique [24], we obtain

Corollary 2. Determining the size of a C-repair for denial constraints is
complete for $FP^{NP(\log(n))}$. □

Using the hypergraph representation of C-repairs, it is possible to generalize 
Corollary 1 to the case of denial constraints and queries that are conjunctions 
of atoms. 

Proposition 3. For denial constraints and non-existentially quantified
conjunctive queries, CQA under the C-repair semantics belongs to
$P^{NP(\log(n))}$. □

This result can be generalized to conjunctive queries containing negation, but no
quantifiers. For example, the query
$Q = A_1 \wedge \cdots \wedge A_i \wedge \neg A_{i+1} \wedge \cdots \wedge \neg A_n$
is true iff each positive conjunct is contained in every maximum independent
set, and each atom preceded by a negation is not contained in any maximum
independent set. Deciding if a database tuple $A$ is not contained in any
maximum independent set of a graph $G$ is in $P^{NP(\log(n))}$, because it is
the complement of the problem of deciding if a database tuple $A$ is contained
in some maximum independent set of a graph, and $P^{NP(\log(n))}$ is closed
under complement.




In order to obtain hardness for CQA under the C-repair semantics, we need a
useful graph theoretic construction, the block $B_k(G, t)$ (c.f. Figure 1),
consisting of two copies $G_1, G_2$ of $G$, and two internally disconnected
subgraphs $I_k, I_{k+1}$, with $k$ and $k+1$ vertices, resp. Every vertex in
$G_1$ ($G_2$) is connected to every vertex in $I_k$ (resp. $I_{k+1}$).

Figure 1. The block $B_k(G, t)$

Lemma 3. (the block construction) Given a graph $G$ and a number $k$, there
exists a graph $B_k(G, t)$, where $t$ is a distinguished vertex in it, such that
$t$ belongs to all maximum independent sets of $B_k(G, t)$ iff the cardinality
of a maximum independent set of $G$ is equal to $k$. $B_k(G, t)$ can be computed
in polynomial time in the size of $G$. □

Proposition 4. Deciding if a vertex belongs to all maximum independent sets
of a graph is $P^{NP(\log(n))}$-hard. □

This result can be proved by reduction from the following
$P^{NP(\log(n))}$-complete decision problem [24]: Given a graph $G$ and an
integer $k$, is the size of a maximum clique in $G$ equivalent to $0 \bmod k$?
$G$ is reduced to a graph that is built by
combining a number of versions of the block construction in Figure 1. Now, 
using Proposition 2, the graph constructed for the reduction in Proposition 4 
can be represented as a database consistency problem, and in this way we obtain 

Theorem 1. For denial constraints, CQA for ground atomic queries under the
C-repair semantics is $P^{NP(\log(n))}$-complete. □

This theorem is interesting, because CQA for denial constraints under the
S-semantics is in PTIME for arbitrary ground atomic queries [11]; and also
because query answering under the S-semantics in the context of belief
revision/update is more complex than the same problem for C-semantics (assuming
the polynomial hierarchy does not collapse); more precisely Winslett's
framework [12] (based on set inclusion) is $\Pi_2^P$-complete, while Dalal's
[13] (based on set cardinality) is $P^{NP(\log(n))}$-complete [16]. Connections
between CQA and belief revision were already established in [2]. Notice that
our complexity results do not follow, at least not straightforwardly, from the
results for belief revision presented in [16]. Those results apply in the
propositional setting, in combined complexity (as opposed to data complexity),
and the revision formulas (in our case the constraints) and the query do not
necessarily satisfy our conditions.

Now we consider a weighted version of the C-semantics. Under denial con- 
straints, this means that it may be more costly to remove certain tuples than 
others to restore consistency. 



Definition 5. (weC-semantics) Assume that every database tuple $R(\bar{t})$ in
$D$ has an associated numerical cost $w(R(\bar{t}))$. $D'$ is a repair of $D$
under the weighted minimum cardinality set semantics if the order relation used
in Definition 1 is given by
$D_1 \preceq_{D,w} D_2 :\Longleftrightarrow |\Delta(D, D_1)|_w \le |\Delta(D, D_2)|_w$,
where $|S|_w$ for a set of database tuples $S$ is the sum of the weights of the
elements of $S$. □

This semantics is a generalization of the C-semantics, that can be obtained by 
defining all the weights as 1; and as such, it is still tuple-based, actually tuple 
deletion-based for denial constraints. 
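
The effect of the weights can be seen on the instance of Example 1. The sketch below is a brute-force illustration under the assumption that repairs are obtained by deletions only (as for denial constraints); the weight assignments `unit` and `heavy` are hypothetical and chosen only to show how the repair changes.

```python
from itertools import combinations

def fd_ok(instance):
    """True iff the FD X -> Y of Example 1 holds in the instance."""
    return all(t[1] == u[1] for t in instance for u in instance if t[0] == u[0])

def weighted_repairs(db, weight, consistent):
    """Consistent sub-instances of db with minimum total weight of deleted tuples."""
    tuples = sorted(db)
    cands = [frozenset(kept) for r in range(len(tuples) + 1)
             for kept in combinations(tuples, r) if consistent(kept)]

    def cost(candidate):
        return sum(weight[t] for t in db if t not in candidate)

    best = min(cost(c) for c in cands)
    return {c for c in cands if cost(c) == best}

D = {("a", "b", "c"), ("a", "c", "d"), ("a", "c", "e")}
unit = {t: 1 for t in D}                      # all weights 1: plain C-semantics
heavy = {("a", "b", "c"): 5,                  # hypothetical costs
         ("a", "c", "d"): 1, ("a", "c", "e"): 1}
```

With unit weights the result coincides with the C-repair $D_2$; with the `heavy` assignment it becomes cheaper to delete the two tuples of $D_2$, and the only repair is $\{P(a,b,c)\}$.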

Proposition 5. CQA for ground atomic queries wrt denial constraints under
the weC-repair semantics belongs to $P^{NP}$. □

4 Incremental Complexity of CQA 

We will consider the following problem of incremental CQA: Assume that we 
have a consistent database $D$ wrt certain integrity constraints. After an
update sequence $U$ on $D$ composed of update operations of any of the forms
$insert(R(\bar{t}))$, $delete(R(\bar{t}))$, meaning insert/delete tuple
$R(\bar{t})$ into/from $D$, or $change(R(\bar{t}), A, a)$, for changing the
value of attribute $A$ in $R(\bar{t})$ to $a$, with $a \in \mathcal{U}$,
we may obtain an inconsistent database. We are interested in whether we can 
find consistent query answers from the updated inconsistent database more ef- 
ficiently by taking into account the previous consistent database state. We will 
see that for a few particular cases of the consistent query answering problem, the 
knowledge of the previous state may significantly simplify consistent query an- 
swering, while in other cases the worst-case computational complexity of query 
answering is the same as in the classic, static scenario, where no updates are 
considered. 

Definition 6. For a set of integrity constraints $IC$, a database $D$ that is
consistent wrt $IC$, and a sequence $U$ of one-tuple database update operations
$U_1, \ldots, U_m$, incremental consistent query answering for query $Q$ is CQA
for $Q$ wrt $IC$ from the instance $U(D)$ that results from applying $U$ to $D$.
The complexity of incremental CQA is measured wrt the size of $D$. □
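
For concreteness, the instance $U(D)$ can be computed as sketched below. The encoding is an assumption made for illustration: atoms are pairs (relation name, tuple of values), an update is a tuple whose first component is one of "insert", "delete" or "change", and the attribute $A$ of a change operation is identified by its position.

```python
def apply_updates(db, updates):
    """Apply a sequence of one-tuple updates to a set of (relation, values) atoms."""
    result = set(db)
    for op in updates:
        kind, atom = op[0], op[1]
        if kind == "insert":
            result.add(atom)
        elif kind == "delete":
            result.discard(atom)
        elif kind == "change":
            # ("change", atom, position, new_value): replace one attribute value.
            _, (rel, values), position, new_value = op
            if (rel, values) in result:
                result.remove((rel, values))
                changed = values[:position] + (new_value,) + values[position + 1:]
                result.add((rel, changed))
        else:
            raise ValueError(f"unknown update {kind!r}")
    return result

D_prime = {("P", ("a", "c", "d"))}
U = [("insert", ("P", ("a", "f", "d")))]
updated = apply_updates(D_prime, U)
```

Here `updated` is the inconsistent instance $D' \cup \{P(a, f, d)\}$ of Example 3.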

In order not to add complexity on the sole basis of the length of the update 
sequence, we will usually assume that m is small in comparison to the size of 
the underlying database $D$, say $m < c \cdot |D|$. We are in general interested in
data complexity, i.e. wrt $|D|$. However, in Section 4.1 we consider parameterized
complexity, where the role of parameter m is considered. Furthermore, we con- 
sider an update sequence U as atomic in the sense that it is completely executed 
or not. In particular, this allows us to concentrate on "minimized" versions of 
update sequences, e.g. containing only insertions and/or attribute changes when 
dealing with denial constraints, because deletions do not have any effects on 
them. 

A notion of incremental complexity has been introduced in [27], and also in 
[23] under the name of dynamic complexity. However, our notion is different. In 
those papers, the instance that is updated can be arbitrary, and the question 
is about the complexity for the updated version when information about the
previous instance can be used. In our case, we are assuming that the initial
database is consistent, and then the problem of finding its repairs is trivial (the 
only repair is the database itself) and CQA is easy (just query the database as 
usual). However, the same problems for the updated database are not necessarily 
trivial or easy. Furthermore, as opposed to [27, 23], where new incremental or 
dynamic complexity classes are introduced, we appeal to those classic complexity 
classes found at a low level in the polynomial hierarchy, which are applied to 
data complexity, relative to the size of the initial database. 



4.1 Incremental complexity: C-repair semantics 

In this section we study the computational complexity of CQA under the
C-semantics over an inconsistent database that is obtained through a short sequence
of update operations on a consistent database.

Proposition 6. Under the C-semantics, incremental CQA for first-order boolean
queries, denial constraints, and a sequence of atomic updates U: U_1, ..., U_m
applied to a database D is in PTIME in the size of D. □

As we saw in Theorem 1, static CQA for denial constraints under the C-semantics
is a hard problem, in contrast to incremental CQA for the same class of queries
and constraints. For the latter problem, an upper bound of O(m · n^m) can be
obtained (c.f. proof of Proposition 6), that is polynomial in the size n of the initial
database, but exponential in m. So, the problem is tractable in data complexity,
but the size of the update sequence is in the exponent of n. We are interested
in determining if a query can be consistently answered in O(f(m) × n^c), where
c is a constant and f(m) is a function which depends only on m, and by doing
so, to isolate the complexity introduced through the update.

The area of parameterized complexity (or fixed parameter tractability) [15,
22, 30] provides the right tools to attack this problem. A decision problem with
inputs of the form (I, p), where p is a distinguished parameter of the input, is
fixed parameter tractable, and by definition belongs to the class FPT [15], if it
can be solved in time O(f(|p|) · |I|^c), where c and the hidden constant do not
depend on |p| or |I| and f does not depend on |I|.

Definition 7. (parameterized CQA) Given a query Q, a set of ICs IC, and a
ground tuple t, the parameterized complexity of CQA is the complexity of the
decision problem CQA^p(Q, IC, t) := {(D, U) | D is an instance, U an update
sequence, t is a consistent answer to Q in U(D)}, whose parameter is U, and the
consistency of an answer refers to the C-repairs of U(D). □

We fix Q, IC and t in the problem definition because, except for the parameter
U, we are interested in data complexity. We emphasize that we are considering
the parameterized version of incremental CQA, and not of CQA in general.

Proposition 7. Incremental CQA for atomic ground queries and functional
dependencies under the C-repair semantics is in FPT, the parameter being
the size m of the update sequence. □

The vertex cover problem belongs to the class FPT, i.e. there is a polynomial
time parameterized algorithm that solves it, say VC(G, k), that determines if
graph G has a vertex cover of size no bigger than k [15], e.g. there is one that
runs in time O(1.2852^k + k · n), n being the size of G [10].

The algorithm whose existence is claimed in Proposition 7 is essentially as
follows (c.f. its proof for details): Let G be the conflict graph associated to the
database obtained after the insertion of m tuples. By binary search, calling each
time VC(G, _), it is possible to determine the size of a minimum vertex cover for
G. This gives us the minimum number of tuples that have to be removed in order
to restore consistency; and can be done in time O(log(m) · (1.2852^m + m · n)),
where n is the size of the original database. In order to determine if a tuple R(t)
belongs to every maximum independent set, i.e. if it is consistently true, compute
the size of a minimum vertex cover for G \ {R(t)}. The two numbers are the
same iff the answer is yes. The total time is still O(log(m) · (1.2852^m + m · n)),
which is linear in the size of the original database.
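
The procedure just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the decision procedure `has_vertex_cover` is the textbook bounded search tree (roughly 2^k branches), not the O(1.2852^k + k · n) algorithm of [10]; the function names and the toy conflict graph are hypothetical; and `tup` is assumed to be a tuple of the updated database.

```python
def has_vertex_cover(edges, k):
    """Decide whether the graph given by `edges` has a vertex cover of size <= k."""
    if not edges:
        return True
    if k == 0:
        return False
    u, v = next(iter(edges))
    # Every cover contains u or v, so branch on the two choices (depth <= k).
    return any(
        has_vertex_cover({e for e in edges if w not in e}, k - 1) for w in (u, v)
    )


def min_vertex_cover_size(edges, upper):
    """Binary search over [0, upper]; assumes a cover of size `upper` exists."""
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if has_vertex_cover(edges, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo


def consistently_true(edges, tup, m):
    """True iff `tup` lies in every maximum independent set of the conflict graph.

    `m` is the number of inserted tuples, which always form a vertex cover.
    """
    without_tup = {e for e in edges if tup not in e}
    return min_vertex_cover_size(edges, m) == min_vertex_cover_size(without_tup, m)


# Hypothetical conflict graph: one inserted tuple "new" conflicts with r1 and r2.
conflicts = {("r1", "new"), ("r2", "new")}
print(consistently_true(conflicts, "r1", m=1))   # r1 is kept by every C-repair
print(consistently_true(conflicts, "new", m=1))  # the inserted tuple is not
```

Only the parameter m bounds the depth of the search tree; the size of the original database enters through the cost of filtering the edge set.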

The same algorithm applies if, in addition to tuple insertions, we also have
changes of attribute values in the update part; of course, still under the C-repair
semantics.

Having established the fixed parameter tractability of CQA, it becomes
relevant to find better parameterized algorithms to solve this problem. The proof
of Proposition 7 uses the membership in FPT of the vertex cover problem for
graphs [15]. That is why we restricted ourselves to functional dependencies, that
are associated to conflict graphs. However, the result can be extended to denial
constraints. In fact, in this case we have conflict hypergraphs, but the maximum
size of a hyperedge is the maximum number of database atoms in a denial
constraint, which is determined by the fixed database schema. If this number is
d, then we are in the presence of the so-called d-hitting set problem, consisting
in finding the size of a minimum hitting set for a hypergraph with hyperedges
bounded in size by d. This problem is in FPT [29].
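
To illustrate why bounded hyperedge size keeps the problem fixed parameter tractable, here is a minimal Python sketch of the standard bounded search tree for d-hitting set. It is not the algorithm of [29]; the function name and the sample hypergraph are hypothetical.

```python
def has_hitting_set(hyperedges, k):
    """Decide whether some set of at most k vertices intersects every hyperedge."""
    if not hyperedges:
        return True
    if k == 0:
        return False
    first = hyperedges[0]
    # One vertex of `first` must be chosen: at most d branches, depth at most k.
    return any(
        has_hitting_set([e for e in hyperedges if v not in e], k - 1) for v in first
    )


# Hypothetical conflict hypergraph with hyperedges of size d = 3.
conflicts = [
    frozenset({"a", "b", "c"}),
    frozenset({"c", "d", "e"}),
    frozenset({"e", "f", "a"}),
]
print(has_hitting_set(conflicts, 1), has_hitting_set(conflicts, 2))
```

The search tree has at most d^k leaves, so the exponential part of the running time depends only on d and k, not on the number of hyperedges.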

Theorem 2. Incremental CQA for atomic ground queries and denial constraints
under the C-repair semantics is in the class FPT, the parameter being
the size of the update sequence. □

The membership in FPT can be extended to incremental CQA under the
possible semantics that sanctions as true what is true of some C-repair. This
is due to the reduction of this semantics to the certain semantics exhibited in
Section 3, which requires the introduction of only a few extra vertices (this also
holds for denial constraints and their hypergraphs).

In a different direction, incremental CQA, considered as a parameterized
problem in the size of the update, becomes MONOTONE W[1]-hard [15], where
the class MONOTONE W[1] is defined as W[1], but in terms of monotone
circuits [14]. This result uses a uniform and parameterized reduction from the
MONOTONE W[1]-hard problem [14] WEIGHTED MONOTONE 3CNF SAT:

Proposition 8. The parameterized complexity of CQA wrt denial constraints
under C-repair semantics is MONOTONE W[1]-hard. □

Since MONOTONE W[1] coincides with the class FPT [15], we conclude that
incremental CQA under this setting is FPT-hard.

5 Incremental Complexity of CQA: Other Semantics

In this section we consider the S-repair semantics based on set difference of 
tuples, and the A-repair semantics based on changes of attribute values (c.f. 
Definition 3). 

5.1 S-repair semantics 

Incremental CQA for non-quantified conjunctive queries under denial constraints 
belongs to PTIME, which can be established by applying the algorithm in [11] 
for the static case. However, for quantified conjunctive queries the incremental
CQA for the S-repair semantics is not in PTIME anymore, which contrasts with
incremental CQA for the C-repair semantics (c.f. Proposition 6). In fact, by
reduction from static CQA for conjunctive queries and denial ICs under the 
S-repair semantics, which is coNP-hard [11], we obtain 

Proposition 9. Under the S-repair semantics, incremental CQA for conjunctive
queries and denial constraints is coNP-hard. □

Despite the fact that static CQA is harder for the C-repair semantics than for
the S-repair semantics for denial constraints (P^{NP(log(n))}-hard vs. coNP-hard),
incremental CQA under the S-repair semantics is harder than the same problem
under the C-repair semantics. The reason is that for the C-repair semantics the
cost of a repair cannot exceed the size of an update, whereas for the S-repair
semantics the cost of a repair may be unbounded wrt the size of an update.

Example 4. Consider a schema R(·), S(·) with the denial constraint ∀x∀y ¬(R(x) ∧
S(y)); and the consistent database D = {R(1), ..., R(n)}, with an empty table
for S. After the update U = insert(S(0)), the database becomes inconsistent,
and the S-repairs are {R(1), ..., R(n)} and {S(0)}. However, only the former is
a C-repair; it is at a distance 1 from the updated instance U(D), i.e. the size of
the update. In contrast, the second S-repair is at a distance n. □
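
Example 4 can be replayed by brute force for a small n. The sketch below is only illustrative: it enumerates all subinstances (exponential time), the encoding of tuples as pairs `(relation, value)` is an assumption, and repairs are restricted to tuple deletions, which suffices for denial constraints.

```python
from itertools import combinations


def repairs(db, is_consistent):
    """Return (S-repairs, C-repairs) of `db` when only tuple deletions are allowed."""
    db = frozenset(db)
    consistent = [
        frozenset(s)
        for r in range(len(db) + 1)
        for s in combinations(db, r)
        if is_consistent(frozenset(s))
    ]
    # S-repairs: consistent subinstances that are maximal wrt set inclusion.
    s_repairs = [s for s in consistent if not any(s < t for t in consistent)]
    # C-repairs: consistent subinstances of maximum cardinality.
    best = max(len(s) for s in consistent)
    c_repairs = [s for s in s_repairs if len(s) == best]
    return s_repairs, c_repairs


def denial_ok(inst):
    # Denial constraint of Example 4: no R-tuple may coexist with an S-tuple.
    rels = {rel for rel, _ in inst}
    return not {"R", "S"} <= rels


n = 4
updated = {("R", i) for i in range(1, n + 1)} | {("S", 0)}
s_reps, c_reps = repairs(updated, denial_ok)
for rep in s_reps:
    print(sorted(rep), "distance:", len(updated - rep))
```

The distances printed for the two S-repairs are the size of the update and n respectively, which is the gap the example points at.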

5.2 A-repair semantics 

Before addressing the problem of incremental complexity, we give a complexity 
lower bound for the weighted version of static CQA for A-repairs. For this case, 
we need a weight function w that sends triples of the form (R(t), A, newValue),
where R(t) is a database tuple stored in the database, A is an attribute of R,
and newValue is a new value for A in R(t), to numerical values. The weighted
A-repair semantics (wA-repair semantics) is just a particular case of Definition
3(c), where the distance is given by an aggregation function g applied to the set
of numbers {w(R(t), A, newValue) | R(t) ∈ D}.

Typically, g is the sum, and the weights are w(R(t), A, newValue) = 1 if
R(t)[A] is different from newValue, and 0 otherwise, where R(t)[A] is the
projection of database tuple R(t) on attribute A. That is, just the number of changes
is counted. However, in [6], g is still the sum, but the weight function is given
by w(R(t), A, newValue) = α_A · (R(t)[A] − newValue)^2, where α_A is a coefficient
introduced to capture the relative importance of attribute A or scale factors.

Theorem 3. Static CQA for ground atomic queries and denial constraints
under the wA-repair semantics is P^NP-hard. □
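
The two weight functions mentioned before Theorem 3 can be written down directly. The following Python sketch is illustrative only: tuples are modelled as dictionaries, and the names `unit_weight`, `make_quadratic_weight` and `distance`, as well as the sample coefficients, are assumptions rather than notation from [6].

```python
def unit_weight(tup, attr, new_value):
    """Weight 1 if the attribute value actually changes, 0 otherwise."""
    return 1 if tup[attr] != new_value else 0


def make_quadratic_weight(alpha):
    """Weight alpha_A * (R(t)[A] - newValue)^2 for per-attribute coefficients alpha."""
    def w(tup, attr, new_value):
        return alpha[attr] * (tup[attr] - new_value) ** 2
    return w


def distance(changes, w, g=sum):
    """Aggregate the weights of a collection of (tuple, attribute, new value) triples."""
    return g(w(tup, attr, new_value) for tup, attr, new_value in changes)


person = {"age": 30, "income": 100}
changes = [(person, "age", 32), (person, "income", 100)]
print(distance(changes, unit_weight))
print(distance(changes, make_quadratic_weight({"age": 2, "income": 1})))
```

With the unit weight the distance counts changed values; with the quadratic weight a change is penalized in proportion to the squared numerical displacement.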

To obtain complexity lower bounds for incremental CQA under this repair se- 
mantics, we need first a technical result 

Lemma 4. For any planar graph G with vertices of degree at most 4, there
exists a regular graph G' of degree 4 that is 4-colorable, such that G' is
3-colorable iff G is 3-colorable, and G' can be built in polynomial time in the size
of G. □

Notice that graph G, due to its planarity, is 4-colorable. The graph G' is an
extension of graph G that may not be planar, but preserves 4-colorability. Now,
from Lemma 4 and the NP-hardness of 3-colorability for planar graphs with
vertices of degree at most 4 [20, theorem 2.3], we obtain

Corollary 3. 3-Colorability for regular graphs of degree 4 (i.e. with all their
vertices of exactly degree 4) is NP-complete. □

We use the construction in Lemma 4 as follows: Given any planar graph G of 
degree 4, we construct graph G' as in the lemma, which is regular of degree 4 
and 4-colorable. Its 4-colorability is encoded as a database problem with a fixed 
set of first-order constraints. Since G' is 4-colorable, the database is consistent. 
Furthermore, G' uses all the 4 colors available in the official table of colors, 
as specified by the ICs. In the update part, deleting one of the official colors 
leaves us with the problem of coloring graph G' with only the three remaining 
colors (under an A-repair semantics only changes of colors are allowed to restore 
consistency), which is possible iff the original graph G is 3-colorable. Deciding
about the latter problem is NP-complete [20]. We obtain

Theorem 4. For A-repairs, ground atomic queries, first-order ICs, and update
sequences consisting of tuple deletions, incremental CQA is coNP-hard. □

This result applies to first-order ICs and delete operations. For incremental CQA
in general, update operations that introduce violations can be in principle of any
of the forms insert, delete, change. Of course, the hardness result just obtained
then trivially applies to general update sequences.

In order to obtain a hardness result for denial constraints (for which we are 
assuming update sequences do not contain tuple deletions), we can use the kind 
of A-repairs introduced in [6].

Theorem 5. Incremental CQA wrt denial constraints and atomic queries under
the wA-repair semantics is P^NP-hard. □

Under the attribute-based repairs semantics, if the update sequence consists
of change actions, then we can obtain polynomial time incremental CQA under
the additional condition that the set of attribute values that can be used to
restore consistency is bounded in size, independent from the database (or its
active domain). Such an assumption can be justified in several applications, like
in census-like databases that are corrected according to inequality-free denial
constraints that force the new values to be taken on the border of a database
independent region [6]; and also in applications where denial constraints, this
time containing inequalities, force the attribute values to be taken in a finite,
pre-specified set. The proof is similar to the one of Proposition 6, and the polynomial
bound now also depends on the size of the set of candidate values.

Theorem 6. Under A-repairs that can be obtained using values from a database- 
independent bounded set, incremental CQA for first-order boolean queries, de- 
nial constraints, and update sequences containing only change actions is in 
PTIME in the size of the original database. □ 

6 Conclusions 

The dynamic scenario for consistent query answering that considers possible 
updates on a database had not been considered before in the literature. Doing 
incremental CQA on the basis of the original database and the sequence of 
updates is an important and natural problem. Developing algorithms that take
into account previously obtained consistent answers that are possibly cached
and the updates at hand is a crucial problem for making CQA scale up for real
database applications. Much research is still needed in this direction.

In this paper we have concentrated mostly on complexity bounds for this 
problem under different semantics. When we started obtaining results for incre- 
mental CQA under repairs that differ from the original instance by a minimum 
number of tuples, i.e. C-repairs, we realized that this semantics had not been 
sufficiently explored in the literature in the static version of CQA, and that a 
comparison was not possible. In the first part of this paper we studied the com- 
plexity of CQA for this semantics. In doing so, we have developed graph theoretic 
techniques that allow us to connect the certain and possible (or cautious and 
brave) semantics for CQA. 

Our results show that the incremental complexity is lower than the static one
in several useful cases, but sometimes the complexity cannot be lowered. The
development of concrete and explicit algorithms for incremental CQA is a
subject of ongoing work. Also the complexity of incremental CQA under the
alternative semantics presented in Section 5 deserves further investigation and a more
complete picture still has to emerge.

We obtained the first results about fixed parameter tractability for incremen- 
tal CQA, where the input, for a fixed database schema, can be seen as formed 
by the original database and the update sequence, whose length is a relevant pa- 
rameter. This problem requires additional investigation. It would be interesting 
to examine the area of CQA in general from the point of view of parameterized 
complexity, and not only the incremental case. For example, other natural can- 
didates to be a parameter in the classic, static setting could be: (a) the number 
of inconsistencies in the database, (b) the degree of inconsistency, i.e. the max- 
imum number of violations per database tuple, (c) complexity of inconsistency, 
i.e. the length of the longest path in the conflict graph or hypergraph. These 
parameters may be practically significant since in many applications, like census
applications [6], inconsistencies are "local".

We considered a version of incremental CQA that assumes that the database
is already consistent before updates are executed, a situation that could have
been achieved because no previous updates violated the given semantic
constraints or a repaired version was chosen before the new updates were executed.
We are currently investigating the dynamic case of CQA in the frameworks of
dynamic complexity [23, 33] or incremental complexity as introduced in [27]. In
this case we start with a database D that is not necessarily consistent -and this
is the main new issue involved- on which a sequence of basic update operations
U_1, U_2, ..., U_m is executed. A clever algorithm for CQA may create or update
intermediate data structures at each atomic update step, to help obtain answers
at subsequent steps. We are interested in the computational complexity of CQA
after a sequence of updates, when the data structures created by the query
answering algorithm at previous states are themselves updatable and accessible.

7 Appendix: Proofs

Proof of Lemma 1: We consider the three cases for membership of v to
maximum independent sets in G. Let m be the cardinality of a maximum independent
set in G. We establish now the first bi-conditional. The second bi-conditional
follows directly from the analysis for the first one.

(a) Assume that v belongs to a maximum independent set I of G. In this
case, v' can be added to I obtaining an independent set of G'. In this case
|I ∪ {v'}| ≥ m + 1.

Assume that v does not belong to some maximum independent set I' of G'. If
v ∉ I', then some of its neighbors belong to I', and then, v' ∉ I'. In consequence,
I' is also a maximum independent set of G. Then, |I'| = m. But this is not
possible, because the size of I' is at least m + 1.

(b) Assume that v does not belong to any maximum independent set of G.
Then, some of its neighbors can be found in every maximum independent set of
G, and none of them can be extended with v' to become an independent set of
G'.

So, all the maximum independent sets of G are maximum independent sets of
G' of size m.

Assume that v belongs to all maximum independent sets of G'. Then none
of the neighbors of v can be found in independent sets of G, and then v' can
be found in all the maximum independent sets of G'. Since the maximum
independent sets of G' have at least cardinality m, it must hold that the maximum
independent sets of G' have cardinality at least m + 1. Then deleting v' from
all the maximum independent sets of G' will give us independent sets of G of
size at least m, i.e. maximum independent sets of G. To all of them v belongs.
A contradiction. □

Proof of Lemma 2: (sketch) Hang a rhombus from v, i.e. add three other
vertices, two of them connected to v, and the third one connected to the two
previous ones. Then, reason by cases as in the proof of Lemma 1. □

Proof of Proposition 1: For the first claim, given a graph G and a vertex v,
build in polynomial time the graph G' as in Lemma 1. It holds that v belongs
to some maximum independent set of G iff v belongs to every maximum
independent set of G'. Now, v belongs to every maximum independent set of G' iff
|maximum independent set in G'| − |maximum independent set in G| = 1.

Since computing the maximum cardinality of a clique can be done in time
FP^{NP(log(n))} [24] (see also [31, theorem 17.6]), computing the maximum
cardinality of an independent set can be done in the same time (just consider the
complement graph). In consequence, in order to decide about v and G, we can
compute the cardinalities of the maximum independent sets for G and G' in 2
times FP^{NP(log(n))} and next compute their difference. In total, we can perform
the whole computation in FP^{NP(log(n))}. In consequence, by definition of class
FP^{NP(log(n))}, we can decide by means of a polynomial time machine that makes
O(log(n)) calls to an NP oracle, i.e. the decision is made in time P^{NP(log(n))}.
The same proof works for the second claim. It can also be obtained from the
first claim and Lemma 2. □
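
The logarithmic number of oracle calls used in this proof comes from binary search on the answer. The Python sketch below shows the idea; the "oracle" is a brute-force stand-in (an assumption, since no real NP oracle is available), and the function names are hypothetical.

```python
from itertools import combinations


def independent_set_oracle(vertices, edges, k):
    """Brute-force stand-in for the NP oracle: is there an independent set of size k?"""
    return any(
        not any(u in s and v in s for u, v in edges)
        for s in map(set, combinations(vertices, k))
    )


def max_independent_set_size(vertices, edges, oracle=independent_set_oracle):
    """Return (size, number_of_oracle_calls) using binary search on the answer."""
    lo, hi, calls = 0, len(vertices), 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        calls += 1
        if oracle(vertices, edges, mid):
            lo = mid       # an independent set of size mid exists
        else:
            hi = mid - 1   # none exists, so the maximum is smaller
    return lo, calls


cycle = list(range(5))
cycle_edges = [(i, (i + 1) % 5) for i in cycle]
size, calls = max_independent_set_size(cycle, cycle_edges)
print(size, calls)
```

Because subsets of independent sets are independent, asking for size exactly k is the same as asking for size at least k, which is what makes the binary search sound.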

Proof of Corollary 1: Construct the conflict graph for the instance wrt the
FDs. An atomic ground query is consistently true if the corresponding vertex
in the conflict graph belongs to all the maximum independent sets. Then use
Proposition 1. □

Proof of Proposition 2: Consider a graph G = (V, E), and assume the vertices
of G are uniquely labelled. Consider the database schema with two relations,
Vertex(v) and Edges(v_1, v_2, e), and the denial constraint ∀v_1 v_2 ¬(Vertex(v_1) ∧
Vertex(v_2) ∧ Edges(v_1, v_2, e)). Vertex stores the vertices of G. For each edge
{v_1, v_2} in G, Edges contains n tuples of the form (v_1, v_2, i), where n is the
number of vertices in G. All the values in the third attribute of Edges are
different, say from 1 to n|E|. The size of the database instance obtained through this
padding of G is still polynomial in the size of G.

This instance is highly inconsistent, and its C-repairs are all obtained by
deleting vertices, i.e. elements of Vertex alone. In fact, an instance such that all
tuples but one in Vertex are deleted, but all tuples in Edges are preserved is a
consistent instance. In this case, n − 1 tuples are deleted. If we try to achieve a
repair by deleting tuples from Edges, say (v_1, v_2, i), then in every repair of that
kind all the n tuples of the form (v_1, v_2, i) have to be deleted as well. This would
not be a minimal cardinality repair.

Assume that I is a maximum cardinality independent set of G. The deletion
of all tuples (v) from Vertex, where v does not belong to I, is a C-repair. Now,
assume that D is a repair. As we know, only tuples from Vertex may be deleted.
Since, in order to satisfy the constraint, no two vertices in the graph that belong
to D are adjacent, the vertices remaining in Vertex form an independent set in
G.

In general, the number of deleted tuples is equal to n − |I|, where I is an
independent set represented by a repair. So each minimal cardinality repair
corresponds to a maximum independent set and vice-versa. □
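
The padding construction of this proof can be checked on a tiny graph. The sketch below is a brute-force illustration only: the tuple encoding, the function names and the sample path graph are assumptions, and the enumeration of deletion sets is exponential.

```python
from itertools import combinations


def pad(vertices, edges):
    """Build the instance of the proof: Vertex(v) plus n padded copies of each edge."""
    n = len(vertices)
    db = {("Vertex", v) for v in vertices}
    label = 0
    for u, v in edges:
        for _ in range(n):
            label += 1                      # third attribute: distinct values 1..n|E|
            db.add(("Edges", u, v, label))
    return db


def consistent(inst):
    """Denial constraint: no Edges tuple may have both endpoints in Vertex."""
    present = {t[1] for t in inst if t[0] == "Vertex"}
    return not any(
        t[0] == "Edges" and t[1] in present and t[2] in present for t in inst
    )


def c_repairs(db):
    """Brute force: all consistent subinstances obtained with the fewest deletions."""
    tuples = sorted(db, key=repr)
    for size in range(len(tuples) + 1):
        found = [
            db - set(s) for s in combinations(tuples, size) if consistent(db - set(s))
        ]
        if found:
            return found


# Path a - b - c: its only maximum independent set is {a, c}.
path = pad(["a", "b", "c"], [("a", "b"), ("b", "c")])
deleted = [path - r for r in c_repairs(path)]
print(deleted)
```

On this instance the cheapest way to restore consistency removes a single Vertex tuple, whereas breaking an edge would require deleting all of its padded copies.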

Proof of Corollary 2: This follows from Proposition 2, the fact that C-repairs
correspond to maximum cliques in the complement of the conflict graph [4], and
the P^{NP(log(n))}-completeness of determining the size of a maximum clique [24]. □

Proof of Proposition 3: We use the conflict hypergraph. The problem of
determining the maximum clique size for hypergraphs is in FP^{NP(log(n))} by the
same argument as for conflict graphs: Deciding if the size of a maximum clique is
greater than k is in NP. So, by asking a logarithmic number of NP queries, we
can determine the size of a maximum clique.

The membership in P^{NP(log(n))} of CQA for the C-semantics still holds for
conjunctive queries without existential variables. In fact, given an inconsistent
database D, a query Q, and a ground tuple t, we check if t is a consistent answer
to Q from D as follows: Check if t is an ordinary answer to Q in D (without
considering the constraints). If not, the answer is no.

Otherwise, let t_1, ..., t_k be the database tuples which are answers to Q in
D and produce t as an answer. Since Q does not contain existential variables,
only one such set exists. Compute the size of a maximum independent set for the
graph representation of D, say m_0. Compute the size of a maximum independent
set for the graph representation of D \ {t_1}, say m_1. If m_1 = m_0, then there
exists a maximum independent set of D that does not contain t_1. So, there exists
a minimum repair that does not satisfy that t is an answer to Q. If m_1 < m_0,
repeat this procedure for all tuples in t_1, ..., t_k. Thus, we have to pose k queries
(a number that is determined only by the size of the query) to an FP^{NP(log(n))} oracle.

In consequence, the complexity of CQA for conjunctive queries without
existential variables is in P^{NP(log(n))}. □

Proof of Lemma 3: The new graph G' consists of two copies of G, say G_1, G_2,
two additional graphs, I_k, I_{k+1}, and two extra vertices t, b. Subgraph I_k consists
of k mutually disconnected vertices; subgraph I_{k+1} consists of k + 1 mutually
disconnected vertices. Each vertex of G_1 is adjacent to each vertex of
I_k, and each vertex of G_2 is adjacent to each vertex of I_{k+1}. Each vertex of I_k is
adjacent to t, and each vertex of I_{k+1} is adjacent to b. Finally, t, b are connected
by an edge (c.f. Figure 1).

We claim that vertex t belongs to all maximum independent sets of G' iff the
cardinality of a maximum independent set of G is equal to k. To prove this claim,
we consider a few, but representative possible cases. With I(G) we denote an
arbitrary maximum independent set of G.

1. |I(G)| < k − 1: The maximum independent set of G' is I_k ∪ I_{k+1}, with
cardinality 2k + 1.

2. |I(G)| = k − 1: The maximum independent sets of G' are (a) I(G_1) ∪ I_{k+1} ∪
{t}, and (b) I_k ∪ I_{k+1}, with cardinality 2k + 1.

3. |I(G)| = k: The maximum independent set of G' is I_{k+1} ∪ I(G_1) ∪ {t}, with
cardinality 2k + 2.

4. |I(G)| = k + 1: The maximum independent sets of G' are (a) I(G_1) ∪ I(G_2) ∪ {t},
(b) I(G_1) ∪ I(G_2) ∪ {b}, (c) I(G_1) ∪ I_{k+1} ∪ {t}; with cardinality 2k + 3.

5. |I(G)| > k + 1: The maximum independent sets of G' are (a) I(G_1) ∪ I(G_2) ∪ {t},
(b) I(G_1) ∪ I(G_2) ∪ {b}; with cardinality 2|I(G)| + 1.

Only in case |I(G)| = k, t belongs to all maximum independent sets. □
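
The claim of Lemma 3 can be checked by brute force on a small graph. The following Python sketch builds G' as described in the proof and enumerates its maximum independent sets; the vertex encoding and function names are hypothetical, and the enumeration is exponential, so it is meant for toy inputs only.

```python
from itertools import combinations


def lemma3_graph(vertices, edges, k):
    """Two copies of G, independent sets I_k and I_{k+1}, and the extra vertices t, b."""
    g1 = [("g1", v) for v in vertices]
    g2 = [("g2", v) for v in vertices]
    ik = [("ik", i) for i in range(k)]
    ik1 = [("ik1", i) for i in range(k + 1)]
    E = {(("g1", u), ("g1", v)) for u, v in edges}
    E |= {(("g2", u), ("g2", v)) for u, v in edges}
    E |= {(x, y) for x in g1 for y in ik}       # G_1 fully joined to I_k
    E |= {(x, y) for x in g2 for y in ik1}      # G_2 fully joined to I_{k+1}
    E |= {(x, "t") for x in ik}                 # I_k joined to t
    E |= {(x, "b") for x in ik1}                # I_{k+1} joined to b
    E.add(("t", "b"))
    return g1 + g2 + ik + ik1 + ["t", "b"], E


def maximum_independent_sets(V, E):
    """Brute force, largest candidate size first (exponential; tiny graphs only)."""
    for size in range(len(V), -1, -1):
        found = []
        for s in combinations(V, size):
            chosen = set(s)
            if not any(x in chosen and y in chosen for x, y in E):
                found.append(chosen)
        if found:
            return found


def t_in_every_maximum_independent_set(vertices, edges, k):
    V, E = lemma3_graph(vertices, edges, k)
    return all("t" in s for s in maximum_independent_sets(V, E))


# Path a - b - c has maximum independent sets of size 2.
path_vertices, path_edges = ["a", "b", "c"], [("a", "b"), ("b", "c")]
for k in (1, 2, 3):
    print(k, t_in_every_maximum_independent_set(path_vertices, path_edges, k))
```

For this path, k = 1, 2, 3 correspond to cases 4, 3 and 2 of the proof, so only k = 2 should place t in every maximum independent set.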

Proof of Proposition 4: By reduction from the following P^{NP(log(n))}-complete
decision problem [24, theorem 3.5]: Given a graph G and an integer k, is the size
of a maximum clique in G equivalent to 0 mod k?

Assume graph G has n vertices. We can also assume that k is not bigger than
n. Now, we pass to the graph G' that is the complement of G: It has the same
vertices as G, with every two distinct vertices being adjacent in G' iff they are
not adjacent in G. A maximum independent set of G' is a maximum clique of
G and vice-versa. So, the cardinality of a maximum independent set of G' is the
size of a maximum clique of G.

Next, we take advantage of the construction in Lemma 3 (c.f. Figure 1): For
each m ∈ {k, 2k, ..., ⌊n/k⌋ × k}, construct the block graph B_m(G', t_m). (There
are ⌊n/k⌋ possible solutions to the equation x = 0 mod k.) All these graphs are
disconnected from each other. Next, create a new vertex t_g and connect it to
the vertices t_m of the blocks B_m(G', t_m). It is easy to check that the resulting
graph, say 𝒢, has its size bounded above by O(n^2).

It holds that vertex t_g does not belong to every maximum independent set of
the resulting graph 𝒢 iff the size of a maximum independent set of G' is
equivalent to 0 mod k. So, we have a reduction to the complement of our problem,
but the class P^{NP(log(n))} is closed under complement.

In fact, if the size of a maximum independent set of G' is not equivalent to
0 mod k, then for every block B in the resulting graph 𝒢, there exists a maximum
independent set I_B of the block B such that t_B ∉ I_B (t_B is the top node of block B).
The maximum independent set of 𝒢 is {t_g} ∪ ⋃_B I_B (because there are no edges
between blocks and between t_g and other vertices besides the t_B). Consider any
independent set I of 𝒢 that does not contain t_g. The size of the projection of I
on any block is not greater than the size of the maximum independent set of the
block; so |I| ≤ |⋃_B I_B|. So, t_g belongs to every maximum independent set of 𝒢.

Now, if the size of a maximum independent set of G' is equivalent to 0 mod k,
then there exists one block B_0 such that t_{B_0} belongs to every maximum
independent set I_{B_0} of B_0, while for all other blocks B there exists I_B such that
t_B ∉ I_B. Consider a maximum independent set I of the resulting graph 𝒢 that
contains t_g.

Every maximum independent set of 𝒢 that contains t_g is of the form {t_g}
union of maximum independent sets from the blocks B other than B_0 that
do not contain their corresponding t_B union any maximum independent set of
B_0 \ {t_{B_0}}. The size of such a set is s = 1 + Σ_{B ≠ B_0} |I_B| + (|I_{B_0}| − 1). A
maximum independent set I that does not contain t_g is the union of maximum
independent sets I_B of all the blocks B of 𝒢, and its size is equal to Σ_B |I_B|,
i.e. s. Then, there exists a maximum independent set that does not contain t_g. □

Proof of Theorem 1: Membership follows from Proposition 3. Now we prove
hardness. For a graph G and integer k, we construct a database D, such that the
consistent answer to a ground atomic query Q can be used to decide if the size
of a maximum clique of G is equivalent to 0 mod k (c.f. proof of Proposition 4).
Construct the resulting graph 𝒢 of the proof of Proposition 4. Encode graph 𝒢 as
a database inconsistency problem, introducing a unary relation V (for vertices)
and a relation E (3-ary), where E corresponds to the edge relation in 𝒢 plus a
third padding attribute to make changing it more costly. For each vertex v ∈ 𝒢,
there is a tuple (v) in V. We also introduce the denial constraint:
∀v_1 ∀v_2 ¬(V(v_1) ∧ V(v_2) ∧ E(v_1, v_2, _))
(an underscore means any variable implicitly universally quantified). For each
edge {v_1, v_2} ∈ 𝒢, create n different versions (v_1, v_2, p) in E, as in the proof of
Proposition 2. The effect of fixing the database wrt the given denial constraint
may be the removal of tuples representing vertices and/or the removal of tuples
representing edges. We want to forbid the latter alternative because those repairs
do not represent maximum independent sets; and this is achieved by making them
more expensive than vertex removal through the padding process.

The consistent answer to the query V(t_g) is no, i.e. not true in all repairs, iff
t_g does not belong to all maximum independent sets of 𝒢 iff the size of a
maximum independent set of G' is equivalent to 0 mod k iff the size of a maximum
clique of G is equivalent to 0 mod k. □

Proof of Proposition 5: Membership is proved as for the C-repair semantics
(c.f. Proposition 3): Repairs correspond to maximum weighted independent sets
of the associated hypergraph G. The weight of a maximum weighted
independent set can be found in P^NP (as for independent set, but log(O(2^n)) ~ poly(n)
oracle calls are required). To check if a vertex v belongs to all maximum weighted
independent sets, it is good enough to compute weights of maximum
independent sets for G and G \ {v}. □

Proof of Proposition 6: For denial constraints tuple deletions do not
introduce any violations, so we consider a sequence U consisting of tuple insertions
and updates.

Assume that k of the m inserted tuples violate ICs, perhaps together with
some tuples already in D. If we delete k violating tuples, then we get a
consistent database D'; so a minimal repair is at a distance less than or equal to k
from D. To find all minimal repairs it is good enough to check no more than

    N = \binom{n+m}{1} + \binom{n+m}{2} + \cdots + \binom{n+m}{k}

repairs, where |D| = n. If m is small, say less than c · n, then

    N < k \binom{n+m}{k} < m \binom{n+m}{m} < m n^m.

Thus, the incremental complexity of the CQA is polynomial wrt n.

In case U contains change updates, the proof is essentially the same, but the
role of m is taken by m · a, where a is the maximum arity of the relations involved.
This is because we have to consider possible changes in different attributes. □
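
The enumeration behind the proof of Proposition 6 can be sketched in Python as follows. This is a brute-force illustration under stated assumptions: only tuple insertions are handled, candidate deletion sets are drawn from all tuples of the updated instance, and the function names, the tuple encoding and the sample constraint are hypothetical.

```python
from itertools import combinations


def c_repairs_after_update(db, inserted, is_consistent):
    """C-repairs of U(D) under denial constraints, for a consistent D and insertions only."""
    updated = set(db) | set(inserted)
    # Deleting all inserted tuples restores consistency, so size <= m always suffices.
    for size in range(len(inserted) + 1):
        found = [
            updated - set(s)
            for s in combinations(sorted(updated), size)
            if is_consistent(updated - set(s))
        ]
        if found:
            return found


def consistent_answer(query, db, inserted, is_consistent):
    """A boolean query is consistently true iff it holds in every C-repair."""
    return all(query(r) for r in c_repairs_after_update(db, inserted, is_consistent))


def no_r_with_s(inst):
    # Hypothetical denial constraint (as in Example 4): R and S tuples cannot coexist.
    rels = {rel for rel, _ in inst}
    return not {"R", "S"} <= rels


D = {("R", i) for i in range(1, 6)}
U = [("S", 0)]
print(consistent_answer(lambda r: ("R", 1) in r, D, U, no_r_with_s))
print(consistent_answer(lambda r: ("S", 0) in r, D, U, no_r_with_s))
```

The outer loop never goes beyond deletion sets of size m, which is why the number of candidates is polynomial in the size of D for a fixed update length.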

Proof of Proposition 7: First, it is known that the problem of, given a graph
G and a number k, determining if there exists a vertex cover of size less than or
equal to k is in FPT [15]. We will use this problem to solve ours.

Now, let us assume that we have a consistent database D of size n, and we
update it inserting m new tuples, obtaining an inconsistent database D' with
conflict graph G. The size of G is O(n) by our assumption on the size of m in
comparison with n. Every C-repair of D' is a maximum independent set of G,
and can be obtained by deleting from G a minimum vertex cover, because the
problems are complementary. So, a minimum vertex cover corresponds to the
vertices that are to be deleted to obtain a repair.

Since the original database D is consistent, the vertices of G corresponding
to database tuples in D are all disconnected from each other. In consequence,
edges can only be introduced by the update sequence, namely between the m new
tuples or between them and the elements of D. Then, we know that there is a
vertex cover for G of size m. However, we do not know if it is minimum.

In order to find the size of a minimum vertex cover of G, we may start doing
binary search from m, applying an FPT algorithm for vertex cover. Each check
for vertex cover, say for value $m_i$, can be done in
$O(1.2852^{m_i} + m_i \cdot n)$ [10]. Then $\log(m)$ checks take time
$O(\log(m) \cdot (1.2852^{m} + m \cdot n)) < O(f(m) \cdot n)$, with f
an exponential function in m. So, obtaining the size of a minimum vertex
cover for G is in FPT, which gives us the minimum number of tuples to remove
to restore consistency.
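
The search just described can be sketched in Python as follows. This is a minimal illustration and not the algorithm of [10]: the cover check uses the simple bounded search tree (branching on the two endpoints of an uncovered edge, about 2^k steps) in place of the faster procedure cited above, and the function names and the edge-list encoding of the conflict graph are assumptions made for the example.

```python
def has_vertex_cover(edges, k):
    """Bounded search tree: is there a vertex cover of size at most k?"""
    if not edges:
        return True                     # nothing left to cover
    if k == 0:
        return False                    # edges remain but the budget is spent
    u, v = edges[0]
    for w in (u, v):                    # one endpoint of this edge must be chosen
        remaining = [e for e in edges if w not in e]
        if has_vertex_cover(remaining, k - 1):
            return True
    return False

def min_vertex_cover_size(edges, m):
    """Binary search on [0, m], assuming a cover of size m is known to exist."""
    lo, hi = 0, m
    while lo < hi:
        mid = (lo + hi) // 2
        if has_vertex_cover(edges, mid):
            hi = mid                    # a cover of size mid exists, try smaller
        else:
            lo = mid + 1                # no cover of size mid, need more vertices
    return lo

# Toy conflict graphs: n1, n2 are inserted tuples, d1, d2 are old tuples.
size_a = min_vertex_cover_size([("n1", "d1"), ("n1", "d2"), ("n2", "d1")], 2)
size_b = min_vertex_cover_size([("n1", "d1"), ("n1", "d2"), ("n2", "n1")], 2)
```

The binary search is valid because the existence of a cover of size at most k is monotone in k.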

Now, for CQA we want to check if a vertex R(t) belongs to all maximum
independent sets of G, which happens if it does not belong to any minimum
vertex cover. This can be determined by checking the size of a minimum
vertex cover for G and for G \ {R(t)}. If they are the same, then R(t) belongs to
all maximum independent sets and the consistent answer to the query R(t) is yes. □
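
The final test can be checked on small graphs with a brute-force Python sketch. It is meant only to illustrate the comparison of the two cover sizes, not to be fixed-parameter tractable; the function names and the list encoding of vertices and edges are assumptions introduced for the example.

```python
from itertools import combinations

def max_independent_sets(vertices, edges):
    """All maximum independent sets, by brute force (small graphs only)."""
    for size in range(len(vertices), -1, -1):      # largest candidates first
        found = [set(s) for s in combinations(sorted(vertices), size)
                 if not any(u in s and v in s for u, v in edges)]
        if found:
            return found

def min_vertex_cover_size(vertices, edges):
    # A minimum vertex cover is the complement of a maximum independent set.
    return len(vertices) - len(max_independent_sets(vertices, edges)[0])

def in_every_c_repair(vertices, edges, t):
    """Compare minimum vertex cover sizes of G and of G without vertex t."""
    rest = [v for v in vertices if v != t]
    rest_edges = [(u, v) for u, v in edges if t not in (u, v)]
    return (min_vertex_cover_size(vertices, edges)
            == min_vertex_cover_size(rest, rest_edges))

# Conflict graph of a toy instance: tuple b conflicts with a and with c.
V, E = ["a", "b", "c"], [("a", "b"), ("b", "c")]
answers = {t: in_every_c_repair(V, E, t) for t in V}
```

Here the only maximum independent set is {a, c}, so the test succeeds for a and c and fails for b.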

Proof of Proposition 8: By uniform reduction from the W[1]-hard problem [14]
WEIGHTED MONOTONE 3CNF SAT, which is defined as follows: Given a 3CNF
monotone circuit C and an integer k, is it possible to make exactly k of the
inputs 1 and obtain output 1 for C?

The database schema consists of relations $Clause(C, V_1, V_2, V_3, p)$, $Var(V)$,
$Cond(X, Y)$, where p and Y are dummy variables intended to create many copies
of a tuple, to forbid the deletion of those tuples by making the potential repair
too costly. The integrity constraint is
$\forall C V_1 V_2 V_3 p y \neg(Clause(C, V_1, V_2, V_3, p) \wedge Var(V_1) \wedge Var(V_2) \wedge Var(V_3) \wedge Cond(1, y))$.
Given a monotone 3CNF formula $\Psi = \psi_1 \wedge \psi_2 \wedge \cdots \wedge \psi_m$
and a parameter k, for each clause $\psi_i = (x_{i_1} \vee x_{i_2} \vee x_{i_3})$,
where the $x_{i_j}$ are atoms, store in Clause n copies of the form
$(i, x_{i_1}, x_{i_2}, x_{i_3}, p)$ (replace a variable by any new constant if a
clause has fewer than three variables). For each variable x in $\Psi$, store x
in Var. Cond is initially empty. The resulting database is consistent.

Now, on the update part, insert (1, i) into Cond, for i = 1, ..., k. Then there
exists an assignment with weight less than k iff Cond(1, 1) is false in every repair.

Since we have to determine if there exists a satisfying assignment with weight
exactly k, it is good enough to ask a query to two databases, built as before,
but for both k and k + 1. This is compatible with the definition of parametric
reduction, which allows the use of a constant number of instances. In our case
we have that: (a) if weight < k, then the consistent answer is yes, (b) if weight
is equal to k, then the consistent answer is false (i.e. false in all repairs), and
(c) if weight > k, the consistent answer is false. So, we construct two instances,
for k and k + 1. The weight is equal to k iff the consistent answer for the first
instance is false and for the second one it is yes. □

Proof of Proposition 9: By reduction from static CQA for (existentially
quantified) conjunctive queries and denial ICs under minimal set semantics,
which is coNP-hard [11]. Consider an instance for this problem consisting of a
database D, a set of denial ICs IC, and a query Q.

For every denial $ic \in IC$, pick a relation $R^{ic}$ in it and expand it to a
relation $R'^{ic}$ with an extra attribute Control. Also add a new, one-attribute
relation $Controler(A)$. Next, transform each integrity constraint
$ic : \forall \bar{x} \neg(P(\bar{x}) \wedge \cdots \wedge R^{ic}(\bar{x}) \wedge \cdots \wedge \gamma)$
into
$ic' : \forall \bar{x} \forall contr \neg(P(\bar{x}) \wedge \cdots \wedge R'^{ic}(\bar{x}, contr) \wedge Controler(contr) \wedge \gamma)$.
We obtain a set IC' of denial constraints. The original database D is extended
to a database $\bar{D}$ with the new relation Controler, which is initially empty,
and the relations $R'^{ic}$, whose extra attributes Control all initially take the
value 1. Since the extension of Controler is empty, IC' is satisfied.

Now, in the incremental context, we consider the inconsistent instance D'
obtained via the update $insert(Controler(1))$ on the extended instance $\bar{D}$.
The S-repairs of D' wrt IC' are: (a) $\bar{D}$ and (b) all the S-repairs of D
(plus the tuple $Controler(1)$ in each of them), which are in one-to-one
correspondence with the S-repairs of D wrt IC. Now, for a conjunctive query Q
in the language of D, produce the conjunctive query
$Q' : \exists \cdots y_{ic} \cdots Q\frac{R^{ic}}{R'^{ic}}$ in the language of D',
where each atom $R^{ic}(\bar{x})$ in Q is replaced by
$\exists y_{ic} R'^{ic}(\bar{x}, y_{ic})$.

Notice that all the repairs in (b) are essentially contained in the repair in
(a), except for the tuple Controler(1), whose predicate does not appear in the
queries. This is because repairs wrt denial constraints are obtained by tuple
deletions. In consequence, any answer to the conjunctive (and then monotone)
query in a repair in (b) is also an answer in the repair in (a). Hence the repair
in (a) neither contributes any new consistent answers nor invalidates any
answers obtained from the repairs in (b). So, it holds
$Cqa(Q, D, IC) = Cqa(Q', D', IC')$. □

Proof of Theorem 3: We can adapt the proof of theorem 4 in [6] about the
$\Delta^P_2$-hardness of CQA under minimum square distance. We provide a
LOGSPACE-reduction from the following problem [24, theorem 3.4]: Given a
Boolean formula $\psi(X_1, \ldots, X_n)$ in 3CNF, decide if the last variable
$X_n$ is equal to 1 in the lexicographically maximum satisfying assignment
(the answer is No if $\psi$ is not satisfiable).
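
To make the source problem of the reduction concrete, here is a brute-force Python sketch that decides it on small formulas. It takes exponential time and only illustrates the problem statement. The encoding of clauses as lists of signed integers (i for $X_i$, -i for its negation) and the function name are assumptions made for the example.

```python
from itertools import product

def last_variable_in_lex_max(clauses, n):
    """Is X_n equal to 1 in the lexicographically maximum satisfying assignment?

    clauses -- lists of nonzero ints; literal i means X_i, -i means not X_i
    n       -- number of propositional variables X_1, ..., X_n
    """
    # product((1, 0), repeat=n) enumerates assignments in decreasing
    # lexicographic order, with X_1 as the most significant position.
    for bits in product((1, 0), repeat=n):
        if all(any((bits[abs(lit) - 1] == 1) == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits[n - 1] == 1     # first hit is the lexicographic maximum
    return False                        # unsatisfiable: the answer is No

# (not X1 or not X2 or not X3): the maximum satisfying assignment is 1, 1, 0.
answer = last_variable_in_lex_max([[-1, -2, -3]], 3)
```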

Create a database schema with relations:
$Clause(id, Var_1, Val_1, Var_2, Val_2, Var_3, Val_3)$, $Var(var, val)$, $Dummy(x)$,
with denial constraints:
$\forall var, val \, \neg(Var(var, val) \wedge val \neq 0 \wedge val \neq 1)$,
$\forall id, v_1, x_1, v_2, x_2, v_3, x_3 \, \neg(Cl(id, v_1, x_1, v_2, x_2, v_3, x_3) \wedge Var(\_, v_1, x'_1) \wedge Var(\_, v_2, x'_2) \wedge Var(\_, v_3, x'_3) \wedge x_1 \neq x'_1 \wedge x_2 \neq x'_2 \wedge x_3 \neq x'_3 \wedge Dummy(1))$.
The last denial can be replaced by 8 denial constraints without inequalities,
considering all the combinations of values for $x_1, x_2, x_3$ in {0, 1}.

Assume now that $C_1, \ldots, C_m$ are the clauses in $\psi$. For each
propositional variable $X_i$ store in table Var the tuple $(X_i, 0)$, with weight 1,
and $(X_i, 1)$ with weight $2^{n-i}$. Store tuple 1 in Dummy with weight
$2^{n} \times 2$. For each clause $C_i = l_{i_1} \vee l_{i_2} \vee l_{i_3}$, store
in Clause the tuple
$(C_i, X_{i_1}, \tilde{l}_{i_1}, X_{i_2}, \tilde{l}_{i_2}, X_{i_3}, \tilde{l}_{i_3})$,
where $\tilde{l}_{i_j}$ is equal to 1 in case of positive occurrence of variable
$X_{i_j}$ in $C_i$, and to 0 otherwise. For example, for
$C_6 = X_6 \vee \neg X_9 \vee X_{12}$, we store $(C_6, X_6, 1, X_9, 0, X_{12}, 1)$.
The weight of this tuple is $2^{n}$.

Then the answer to the ground atomic query $Var(X_i, 1)$ is yes iff the
variable $X_i$ is assigned value 1 in the lexicographically maximum assignment
(in case such a satisfying assignment exists). In case a satisfying assignment
does not exist, then the tuple in Dummy has to be changed in order to satisfy
the constraints. No attribute value in a tuple in Clause is changed, because the
cost of such a change is higher than a change in the Dummy relation. □

Proof of Lemma 4: If a vertex v in G has degree 2, then we transform it into 
a vertex of degree 4 by hanging from it an "ear" as shown in the figure, which 
is composed of three connected versions of the graph H^ [20, Theorem 2.3] plus 
two interconnected versions of a box graph (c.f. figure below). 



$E\frac{E_1}{E_2}$ means the expression obtained by replacing in expression E the
subexpression $E_1$ by expression $E_2$.

It is easy to see that the "ear" is regular of degree 4, is 3-colorable (as shown in
the picture with colors r, g, b), but not planar. Hanging the ear adds a constant
number of vertices. Now we have to deal with the set $V_{odd}$ of vertices of
degree 1 or 3 (vertices of degree 0 can be ignored). By Euler's theorem, $V_{odd}$
has an even cardinality. This makes it possible to pick disjoint pairs
$\{v_1, v_2\}$ of elements of $V_{odd}$, leaving every vertex coupled to some other
vertex. For each such pair $\{v_1, v_2\}$, add an extra vertex v' connected to
(only) $v_1$ and $v_2$. This trio is 3-colorable. Now $v_1, v_2$ have degree 2 or 4.
From those that become of degree 2, hang the "ear" as before. In this way, all
the nodes become of degree 4. The number of added vertices is polynomial in
the size of the original graph. The 4-colorability of G' follows from the
4-colorability of G (every planar graph is 4-colorable) and the 4-colorability of
the hanging ears. □
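
The pairing step of the construction can be sketched in Python as follows. The sketch covers only the coupling of odd-degree vertices through a new connector vertex; hanging the "ear" gadget is not modelled, since its exact shape is given only in the figure. The adjacency-dictionary encoding and the names pair_odd_vertices and "conn" are assumptions made for the example.

```python
def pair_odd_vertices(adj):
    """Couple odd-degree vertices in pairs through a fresh connector vertex.

    adj -- dict mapping each vertex to the set of its neighbours (undirected)
    Returns a new adjacency dict; the input is left unchanged.
    """
    new = {v: set(ns) for v, ns in adj.items()}
    odd = sorted(v for v, ns in adj.items() if len(ns) % 2 == 1)
    # The number of odd-degree vertices is even (handshake argument).
    assert len(odd) % 2 == 0
    for i in range(0, len(odd), 2):
        v1, v2 = odd[i], odd[i + 1]
        connector = ("conn", v1, v2)    # the extra vertex v' of the proof
        new[connector] = {v1, v2}
        new[v1].add(connector)
        new[v2].add(connector)
    return new

# Path a - b - c - d: the endpoints a and d have degree 1.
g = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
h = pair_odd_vertices(g)
```

After this step every vertex has even degree, so only vertices of degree 2 remain to be handled by the ear.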



Proof of Corollary 3: From Lemma 4 and the NP-hardness of 3-colorability
for planar graphs with vertices of degree at most 4 [20]. □

Proof of Theorem 4: If the update operation U is a delete of a database atom,
we reduce to our problem 3-Colorability of planar graphs G with vertex degree
at most 4, which is NP-complete [20]. Given such a non-empty graph G, we
construct graph G' as in Lemma 4, which is also 4-colorable (because G is and
the ears too).

Let $E(X, Y)$ be a database relation encoding the edges of the graph, Coloring
a 2-ary database relation storing a coloring of the vertices, and Colors a unary
relation storing the four colors allowed. Notice that a 4-coloring of G can be
found in polynomial time [32]. Then also a 4-coloring for G' can be found in
polynomial time (a 4-coloring for the ears can be given once and for all). The
ICs, essentially denials and inclusion dependencies, are as follows:



1. Every node is colored: $\forall x y \exists z (E(x, y) \rightarrow Coloring(x, z))$.

2. Nodes have one color:
$\forall x y_1 y_2 \neg(Coloring(x, y_1) \wedge Coloring(x, y_2) \wedge y_1 \neq y_2)$.

3. Colors must be allowed: $\forall x y (Coloring(x, y) \rightarrow Colors(y))$.

4. Vertex degree is not less than 4:
$\forall x (\exists y E(x, y) \rightarrow \exists y_1 y_2 y_3 y_4 (E(x, y_1) \wedge E(x, y_2) \wedge E(x, y_3) \wedge E(x, y_4) \wedge y_1 \neq y_2 \wedge y_1 \neq y_3 \wedge y_1 \neq y_4 \wedge y_2 \neq y_3 \wedge y_2 \neq y_4 \wedge y_3 \neq y_4))$.

5. Vertex degree is not bigger than 4:
$\forall x y_1 \cdots y_5 \neg(E(x, y_1) \wedge \cdots \wedge E(x, y_5) \wedge y_1 \neq y_2 \wedge \cdots \wedge y_4 \neq y_5)$.

6. Only vertices are colored: $\forall x y \exists z (Coloring(x, y) \rightarrow E(x, z))$.

7. All colors are used: $\forall x \exists z (Colors(x) \rightarrow Coloring(z, x))$.

8. E is symmetric: $\forall x y (E(x, y) \rightarrow E(y, x))$.

9. Adjacent vertices have different colors:
$\forall x y u w \neg(E(x, y) \wedge Coloring(x, u) \wedge Coloring(y, w) \wedge u = w)$.

The initial database D stores the graph G', together with its 4-coloring (that
does use all 4 colors). This is a consistent instance.

For the incremental part, if the update U is the deletion of a color, e.g.
delete Colors(c), i.e. of tuple (c) from Colors, the instance becomes inconsistent,
because an inadmissible color is being used in the coloring. Since repairs can be
obtained by changing attribute values in existing tuples only, the only possible
repairs are the 3-colorings of G' with the 3 remaining colors (if such colorings
exist), which are obtained by changing colors in the second attribute of Coloring.
If there are no colorings, there are no repairs.

The query Q : Colors(c)? is consistently true only in case there is no
3-coloring of the original graph G, because it is true in the empty set of repairs. □

Proof of Theorem 5: We reason basically as in the proof of theorem 4(c) in
[6]; just introduce a new relation Dummy, and transform every denial
$\forall \bar{y} \neg(A_1 \wedge \cdots \wedge A_s)$ there into
$\forall \bar{y} \forall x \neg(A_1 \wedge \cdots \wedge A_s \wedge Dummy(x))$.
If we start with the empty extension for Dummy, the database is consistent. On
the update part, if we insert the tuple Dummy(c) into the database, and the
original denials were violated by the given instance, then we cannot delete that
tuple and no change in it can repair any violations. Thus, the only way to repair
the database is as in [6], which makes CQA $P^{NP}$-hard. □